EXISTENCE AND UNIQUENESS OF OPTIMAL TRANSPORT MAPS 
Abstract. Let $(X, d, m)$ be a non-branching Polish metric measure space. We show existence and
uniqueness of optimal transport maps for cost functions that are non-decreasing and strictly convex
functions of the distance, provided the space satisfies the measure contraction property.



1. Introduction 

In [10], Gaspard Monge studied the by now famous minimization problem

\[ \min_{T_\sharp \mu_0 = \mu_1} \int d(x, T(x)) \, \mu_0(dx) \tag{1.1} \]

on Euclidean space. This problem turned out to be very difficult because the functional and the
corresponding set over which we minimize may both have a non-linear structure. Seventy years ago,
Kantorovich [8] came up with a relaxation of the minimization problem (1.1). He allowed arbitrary
couplings of the two measures $\mu_0$ and $\mu_1$, as well as more general cost functions
$c : X \times X \to \mathbb{R}$:



\[ \min_{q \in \mathrm{Cpl}(\mu_0, \mu_1)} \int c(x, y) \, q(dx, dy). \tag{1.2} \]
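To see why (1.2) is a relaxation of (1.1), it may help to write out how a transport map gives rise to a coupling. The following is a short sketch using only the definitions above; the symbol $q_T$ is introduced here for illustration and does not appear in the original text.

```latex
% A map T with T_\sharp \mu_0 = \mu_1 induces the coupling q_T (notation assumed here):
\[
  q_T := (\mathrm{id}, T)_\sharp \mu_0 \in \mathrm{Cpl}(\mu_0, \mu_1),
  \qquad
  \int c(x, y) \, q_T(dx, dy) = \int c(x, T(x)) \, \mu_0(dx).
\]
% Taking c = d, every competitor in (1.1) is therefore a competitor in (1.2):
\[
  \min_{q \in \mathrm{Cpl}(\mu_0, \mu_1)} \int d(x, y) \, q(dx, dy)
  \;\le\;
  \inf_{T_\sharp \mu_0 = \mu_1} \int d(x, T(x)) \, \mu_0(dx).
\]
```

The two problems coincide exactly when some optimal coupling is of the form $q_T$.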

Minimizers are called optimal couplings and, therefore, this family of problems is commonly called
optimal transport problems. A natural and interesting question is when these two minimization problems
coincide, i.e. when the (or an) optimal coupling is given by a transportation map. In [3], Brenier
showed, using ideas from fluid dynamics, that on Euclidean space with cost function
$c(x, y) = |x - y|^2$ there is always a unique optimal transportation map as soon as $\mu_0$ is
absolutely continuous with respect to the Lebesgue measure. Soon after, McCann [3] generalized this
result to Riemannian manifolds with more general cost functions, including convex functions of the
distance. By now, this result has been shown in a wide class of settings: for non-decreasing strictly
convex functions of the distance in Alexandrov spaces [3], for the squared distance on the Heisenberg
group [2], and recently for the squared distance on $CD(K, N)$ and $CD(K, \infty)$ spaces by Gigli [5]
and for the squared distance cost by Rajala and Ambrosio in a metric Riemannian-like framework [1].

In this paper we show existence and uniqueness of optimal transport maps on non-branching metric
measure spaces satisfying a measure contraction property for cost functions of the form
$c(x, y) = h(d(x, y))$, with $h$ strictly convex and non-decreasing. In particular, our result recovers
most of the previously mentioned results and in some cases also extends them. For example, Juillet [7]
shows that the Heisenberg group satisfies the measure contraction property. Thus, our result extends
the previously known results on the Heisenberg group to the case of non-decreasing strictly convex
cost functions. As a drawback of the generality, we do not get structural results on the transport
map, such as its being the gradient of a nice function.

1.1. Notation and main result. Let $(X, d, m)$ be a non-branching Polish metric measure space. Let
$\mathcal{G}(X)$ be the set of geodesics endowed with the uniform topology inherited from
$C([0, 1]; X)$. Being a closed subset of $C([0, 1]; X)$, it is Polish. For $t \in [0, 1]$ consider the
map $e_t : \mathcal{G}(X) \to X$, the evaluation at time $t$, defined by $e_t(\gamma) = \gamma_t$. For
a subset $A \subset X$ and a point $x \in X$ the $t$-intermediate points between $A$ and $x$ are
defined as

\[ A_t := e_t\left(\{\gamma \in \mathcal{G}(X) : \gamma_0 \in A, \ \gamma_1 = x\}\right). \]

We will assume that $(X, d, m)$ satisfies $MCP(K, N)$, the so-called measure contraction property, for
$K \in \mathbb{R}$ and $N \in \mathbb{N}$ with $N > 2$. The measure contraction property ensures the
existence of a continuous function $f : [0, 1] \to [0, 1]$ with $f(0) = 1$ such that

\[ m(A_t) \ge f(t) \cdot m(A). \]

We omit the precise definition of $f$, and especially its dependence on $K$ and $N$, as we will only
use the stated property. It is worth noting that $MCP(K, N)$ is implied by $CD(K, N)$, see [13]. For
more information on $MCP(K, N)$ we refer to [11] and [13].
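As an illustration that is not taken from the text, consider the model case $X = \mathbb{R}^N$ with the Euclidean distance and Lebesgue measure, a standard example of a space satisfying $MCP(0, N)$. Geodesics are line segments, so the set of $t$-intermediate points is a homothetic copy of $A$, and one admissible choice of $f$ can be read off directly.

```latex
% Model case (an assumption for illustration): X = \mathbb{R}^N, m = Lebesgue measure.
\[
  A_t = \{(1 - t)\, a + t\, x : a \in A\} = x + (1 - t)(A - x),
  \qquad
  m(A_t) = (1 - t)^N \, m(A).
\]
% Hence the stated property holds with
\[
  f(t) = (1 - t)^N, \qquad f(0) = 1, \qquad f \text{ continuous on } [0, 1].
\]
```

In particular $f(t) \to 1$ as $t \to 0$, which is the only feature of $f$ exploited in Section 3.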

Let $\mu_0, \mu_1$ be probability measures over $X$. We study the following minimization problem

\[ \min_{T_\sharp \mu_0 = \mu_1} \int h(d(x, T(x))) \, \mu_0(dx), \]

where $h : [0, \infty) \to [0, \infty)$ is a strictly convex and non-decreasing map. In the sequel, we
will often denote the cost function $h \circ d$ simply by $c$. Let $\Pi(\mu_0, \mu_1)$ be the set of
transference plans, i.e.

\[ \Pi(\mu_0, \mu_1) := \{\pi \in \mathcal{P}(X \times X) : (P_1)_\sharp \pi = \mu_0, \ (P_2)_\sharp \pi = \mu_1\}, \]

where $P_i : X \times X \to X$ is the projection map on the $i$-th component, $P_i(x_1, x_2) = x_i$ for
$i = 1, 2$. We assume that $\mu_0$ and $\mu_1$ have finite $c$-transport distance in the sense that

\[ \inf \left\{ \int_{X \times X} h(d(x, y)) \, \pi(dx\,dy) : \pi \in \Pi(\mu_0, \mu_1) \right\} < \infty. \]

Recall that a transference plan $\pi \in \Pi(\mu_0, \mu_1)$ is said to be $c$-cyclically monotone if
there exists $\Gamma \subset X \times X$ so that $\pi(\Gamma) = 1$ and for every $N \in \mathbb{N}$ and
every $(x_1, y_1), \dots, (x_N, y_N) \in \Gamma$ it holds

\[ \sum_{i=1}^{N} c(x_i, y_i) \le \sum_{i=1}^{N} c(x_{i+1}, y_i), \]

with $x_{N+1} = x_1$.
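For orientation, the simplest non-trivial instance of this definition is the case of two pairs, which states that exchanging the targets of two transported points cannot lower the total cost. This is the form that enters the proof of Lemma 2.1.

```latex
% c-cyclical monotonicity with two pairs (cyclic convention x_3 = x_1):
\[
  c(x_1, y_1) + c(x_2, y_2) \;\le\; c(x_2, y_1) + c(x_1, y_2)
  \qquad \text{for all } (x_1, y_1), (x_2, y_2) \in \Gamma.
\]
```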

We will prove that, if $\mu_0 \ll m$, any $c$-cyclically monotone plan $\pi$ is induced by a map
$T : X \to X$. By $\pi$ being induced by a map we mean that $\pi = (\mathrm{id}, T)_\sharp \mu_0$, i.e.
the two minimization problems (1.1) and (1.2) coincide. A direct corollary of this result is the
uniqueness of the optimal coupling.

More precisely, we will show that branching at starting points does not happen almost surely. The
key idea is to approximate the $c$-cyclically monotone set on which the optimal coupling is concentrated.

2. Evolution estimates 

The probability measures $\mu_0, \mu_1$ are fixed. Since we have to prove a local property, we can
assume that $\mathrm{supp}(\mu_0), \mathrm{supp}(\mu_1) \subset K$ with $K$ compact. Then by standard
results in optimal transportation, there exists a pair of Kantorovich potentials
$(\varphi, \varphi^c)$ such that if

\[ \Gamma := \{(x, y) \in X \times X : \varphi(x) + \varphi^c(y) = c(x, y)\}, \]

then a transport plan $\pi$ is optimal iff $\pi(\Gamma) = 1$ (e.g. see Theorem 5.10 in [14]).

We start by proving the standard property of geodesics belonging to the support of the optimal
dynamical transference plan $\eta$: they cannot meet at the same time $t$ if $t \neq 0, 1$. For
existence results and details on dynamical transference plans we refer to [14], Chapter 7.

Lemma 2.1. Let $(x_0, y_0), (x_1, y_1) \in \Gamma$ be two distinct points. Then for any $t \in (0, 1)$,

\[ d(x_0(t), x_1(t)) > 0, \]

where $x_i(t)$ is any $t$-intermediate point between $x_i$ and $y_i$, for $i = 0, 1$.

Proof. Assume by contradiction the existence of $x_0(t) = x_1(t) \in X$, $t$-intermediate points of
$(x_0, y_0)$ and $(x_1, y_1)$, i.e.

\[ d(x_0, x_0(t)) = t\, d(x_0, y_0), \qquad d(x_0(t), y_0) = (1 - t)\, d(x_0, y_0), \]

and

\[ d(x_1, x_1(t)) = t\, d(x_1, y_1), \qquad d(x_1(t), y_1) = (1 - t)\, d(x_1, y_1). \]

Case 1: $d(x_0, y_0) \neq d(x_1, y_1)$. Then

\begin{align*}
h(d(x_0, y_1)) + h(d(x_1, y_0))
  &\le h\big(d(x_0, x_0(t)) + d(x_1(t), y_1)\big) + h\big(d(x_1, x_1(t)) + d(x_0(t), y_0)\big) \\
  &< t\, h(d(x_0, y_0)) + (1 - t)\, h(d(x_1, y_1)) \\
  &\quad + t\, h(d(x_1, y_1)) + (1 - t)\, h(d(x_0, y_0)) \\
  &= h(d(x_0, y_0)) + h(d(x_1, y_1)),
\end{align*}

where the first inequality follows from the triangle inequality (recall $x_0(t) = x_1(t)$) and the
monotonicity of $h$, and between the first and the second line we have used the strict convexity of
$h$. This contradicts $c$-cyclical monotonicity.

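Case 2: $d(x_0, y_0) = d(x_1, y_1)$. Let $\gamma^0, \gamma^1 \in \mathcal{G}(X)$ be such that

\[ \gamma^0_0 = x_0, \quad \gamma^0_t = x_0(t), \quad \gamma^0_1 = y_0, \qquad
   \gamma^1_0 = x_1, \quad \gamma^1_t = x_1(t), \quad \gamma^1_1 = y_1, \]

and define the curve $\gamma : [0, 1] \to X$ by

\[ \gamma_s := \begin{cases} \gamma^0_s, & s \in [0, t], \\ \gamma^1_s, & s \in [t, 1]. \end{cases} \]

Then $\gamma$ is a geodesic different from $\gamma^0$ but coinciding with it on the non-trivial
interval $[0, t]$. Since this contradicts the non-branching assumption, the claim is proved. □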
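The convexity step in Case 1 of the proof can be spelled out as follows. This is only a sketch of the computation; the abbreviations $a$ and $b$ are introduced here for readability and are not used in the original text.

```latex
% Abbreviations (assumed here): a := d(x_0, y_0), b := d(x_1, y_1), with a \neq b and t \in (0, 1).
% Triangle inequality through the common point x_0(t) = x_1(t):
\[
  d(x_0, y_1) \le t\, a + (1 - t)\, b,
  \qquad
  d(x_1, y_0) \le t\, b + (1 - t)\, a.
\]
% h non-decreasing, then h strictly convex (strict because a \neq b and 0 < t < 1):
\[
  h(d(x_0, y_1)) + h(d(x_1, y_0))
  \le h(t a + (1 - t) b) + h(t b + (1 - t) a)
  < h(a) + h(b).
\]
% But (x_0, y_0), (x_1, y_1) \in \Gamma and c-cyclical monotonicity with two pairs give
\[
  h(a) + h(b) = c(x_0, y_0) + c(x_1, y_1) \le c(x_0, y_1) + c(x_1, y_0),
\]
% which is incompatible with the strict inequality above.
```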

For any closed set $\Lambda \subset X \times X$ we can now consider the associated evolution map. For
every $t \in [0, 1]$ and every compact set $A \subset X$, let

\[ A_{t, \Lambda} := e_t\big((e_0, e_1)^{-1}((A \times X) \cap \Lambda)\big). \]

Note that by the Arzelà-Ascoli Theorem, $A_{t, \Lambda}$ is a compact set.

Theorem 2.2. There exists a continuous map $f : [0, 1] \to [0, 1]$ with $f(0) = 1$ such that for any
compact $\Lambda \subset \Gamma$ the following inequality holds:

\[ m(A_{t, \hat{\Lambda}}) \ge f(t)\, m(A), \qquad \forall t \in [0, 1], \tag{2.1} \]

for any compact set $A \subset P_1(\Lambda)$, where
$\hat{\Lambda} := (P_1(\Lambda) \times P_2(\Lambda)) \cap \Gamma$.

Proof. Step 1. Let $\{y_i\}_{i \in \mathbb{N}} \subset P_2(\Lambda)$ be a dense set in $P_2(\Lambda)$.
Consider the following family of sets: for $n \in \mathbb{N}$ and $i \le n$

\[ E_n(i) := \{x \in P_1(\Lambda) : c(x, y_i) - \varphi^c(y_i) \le c(x, y_j) - \varphi^c(y_j), \ j = 1, \dots, n\}. \]

If we now consider

\[ \Lambda_n := \bigcup_{i=1}^{n} E_n(i) \times \{y_i\}, \]

it is straightforward to check that $P_1(\Lambda_n) = P_1(\Lambda)$ and $\Lambda_n$ is $c$-cyclically
monotone. Indeed, for any $(x_1, y_1), \dots, (x_m, y_m) \in \Lambda_n$, by definition it holds

\[ c(x_i, y_i) - \varphi^c(y_i) \le c(x_i, y_{i+1}) - \varphi^c(y_{i+1}), \qquad i = 1, \dots, m, \]

with $y_{m+1} = y_1$. Taking the sum over $i$, the property follows.
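The summation in the last step can be made explicit as follows. This is a sketch under the cyclic convention $y_{m+1} = y_1$, which is assumed here; the potential terms cancel because they are summed over the same points on both sides.

```latex
% Sum the defining inequalities over i = 1, ..., m, with y_{m+1} = y_1 (assumed convention):
\[
  \sum_{i=1}^{m} c(x_i, y_i) - \sum_{i=1}^{m} \varphi^c(y_i)
  \;\le\;
  \sum_{i=1}^{m} c(x_i, y_{i+1}) - \sum_{i=1}^{m} \varphi^c(y_{i+1}).
\]
% The two sums of potentials agree, since i \mapsto i + 1 is a cyclic relabelling:
\[
  \sum_{i=1}^{m} \varphi^c(y_{i+1}) = \sum_{i=1}^{m} \varphi^c(y_i)
  \quad \Longrightarrow \quad
  \sum_{i=1}^{m} c(x_i, y_i) \le \sum_{i=1}^{m} c(x_i, y_{i+1}).
\]
```

Up to reversing the order of the pairs, this is the inequality in the definition of $c$-cyclical monotonicity.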

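By $MCP(K, N)$ there exists a continuous map $f : [0, 1] \to [0, 1]$ with $f(0) = 1$, independent of
the sequence $\{y_i\}_{i \in \mathbb{N}}$ and of $n$, such that for any compact $A \subset P_1(\Lambda)$
it holds

\[ m\big((A \cap E_n(i))_t\big) \ge f(t)\, m(A \cap E_n(i)), \]

where $(A \cap E_n(i))_t = (A \cap E_n(i))_{t, E_n(i) \times \{y_i\}}$. Note that since
$A = \bigcup_{i \le n} A \cap E_n(i)$ it follows

\begin{align*}
A_{t, \Lambda_n} &= e_t\big((e_0, e_1)^{-1}((A \times X) \cap \Lambda_n)\big) \\
  &= \bigcup_{i \le n} e_t\big((e_0, e_1)^{-1}(((A \cap E_n(i)) \times X) \cap \Lambda_n)\big) \\
  &= \bigcup_{i \le n} (A \cap E_n(i))_{t, \Lambda_n} \\
  &\supset \bigcup_{i \le n} (A \cap E_n(i))_{t, E_n(i) \times \{y_i\}}.
\end{align*}

Moreover from Lemma 2.1 it holds

\[ (A \cap E_n(i))_t \cap (A \cap E_n(j))_t = \emptyset, \qquad i \neq j, \]

for all $t \in (0, 1)$.
Then it holds:

\begin{align*}
m\Big(\bigcup_{i=1}^{n} (A \cap E_n(i))_{t, E_n(i) \times \{y_i\}}\Big)
  &= \sum_{i=1}^{n} m\big((A \cap E_n(i))_t\big) \\
  &\ge f(t) \sum_{i=1}^{n} m(A \cap E_n(i)) \\
  &\ge f(t)\, m\Big(\bigcup_{i=1}^{n} A \cap E_n(i)\Big) \\
  &= f(t)\, m(A). \tag{2.2}
\end{align*}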

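Step 2. Note that for every $n \in \mathbb{N}$, $\Lambda_n \subset \mathrm{supp}(\mu_0) \times \mathrm{supp}(\mu_1)$
and the latter, by assumption, is a subset of $K \times K$. Since the space of closed subsets of
$K \times K$ endowed with the Hausdorff metric, $(\mathcal{C}(K \times K), d_H)$, is a compact space,
there exist a subsequence $\{\Lambda_{n_k}\}_{k \in \mathbb{N}}$ and a compact set
$\Theta \subset K \times K$ such that

\[ \lim_{k \to \infty} d_H(\Lambda_{n_k}, \Theta) = 0. \]

Since the sequence $\{y_i\}_{i \in \mathbb{N}}$ is dense in $P_2(\Lambda)$ and $\Lambda \subset \Gamma$
is compact, by definition of $E_n(i)$, necessarily for every $(x, y) \in \Theta$ it holds

\[ \varphi(x) + \varphi^c(y) = c(x, y), \qquad x \in P_1(\Lambda), \quad y \in P_2(\Lambda). \]

Hence $\Theta \subset (P_1(\Lambda) \times P_2(\Lambda)) \cap \Gamma = \hat{\Lambda}$.

To conclude the proof we have to observe that also $d_H(A_{t, \Lambda_{n_k}}, A_{t, \Theta})$ converges
to $0$ as $k \to \infty$. Then

\[ m(A_{t, \Theta}) \ge \limsup_{k \to \infty} m(A_{t, \Lambda_{n_k}}). \]

Indeed, since $A_{t, \Theta}$ is a compact set, it follows that if
$A^{\varepsilon}_{t, \Theta} = \{x \in X : d(x, A_{t, \Theta}) < \varepsilon\}$, then for $k$
sufficiently big $A_{t, \Lambda_{n_k}} \subset A^{\varepsilon}_{t, \Theta}$, and
$m(A^{\varepsilon}_{t, \Theta})$ converges to $m(A_{t, \Theta})$ as $\varepsilon \to 0$.
Then, since $\Theta \subset \hat{\Lambda}$ implies $A_{t, \Theta} \subset A_{t, \hat{\Lambda}}$,

\[ m(A_{t, \hat{\Lambda}}) \ge \limsup_{k \to \infty} m(A_{t, \Lambda_{n_k}}) \ge f(t)\, m(A), \]

and the claim follows. □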

3. Existence of optimal maps 

In this section we show that branching at starting points does not happen almost surely. 

Lemma 3.1. Let $\Lambda_1, \Lambda_2 \subset \Gamma$ be compact sets such that
i) $P_1(\Lambda_1) = P_1(\Lambda_2)$;
ii) $P_2(\Lambda_1) \cap P_2(\Lambda_2) = \emptyset$.
Then $m(P_1(\Lambda_1)) = m(P_1(\Lambda_2)) = 0$.

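Proof. Let $\hat{\Lambda}_i := (P_1(\Lambda_i) \times P_2(\Lambda_i)) \cap \Gamma$, $i = 1, 2$, as in
Theorem 2.2. Note that since $P_2(\Lambda_1) \cap P_2(\Lambda_2) = \emptyset$, necessarily
$\hat{\Lambda}_1 \cap \hat{\Lambda}_2 = \emptyset$. Hence from Lemma 2.1, for every
$A \subset P_1(\Lambda_1) = P_1(\Lambda_2)$,

\[ A_{t, \hat{\Lambda}_1} \cap A_{t, \hat{\Lambda}_2} = \emptyset \]

for every $t \in (0, 1)$. Then let $A := P_1(\Lambda_1) = P_1(\Lambda_2)$ and recall that as $t \to 0$
the sets $A_{t, \hat{\Lambda}_1}$ and $A_{t, \hat{\Lambda}_2}$ both converge in Hausdorff topology to
$A$. Put $A^{\varepsilon} = \{x : d(x, A) < \varepsilon\}$. Then it follows from Theorem 2.2 that

\begin{align*}
m(A) = \limsup_{\varepsilon \to 0} m(A^{\varepsilon})
  &\ge \limsup_{t \to 0} m\big(A_{t, \hat{\Lambda}_1} \cup A_{t, \hat{\Lambda}_2}\big) \\
  &= \limsup_{t \to 0} \big[ m(A_{t, \hat{\Lambda}_1}) + m(A_{t, \hat{\Lambda}_2}) \big] \\
  &\ge 2\, m(A) \limsup_{t \to 0} f(t) = 2\, m(A).
\end{align*}

Hence, necessarily $m(P_1(\Lambda_1)) = m(P_1(\Lambda_2)) = m(A) = 0$, and the claim follows. □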

We will use the following notation: $\Gamma(x) := (\{x\} \times X) \cap \Gamma$, and given a set
$\Theta \subset X \times X$ we say that $T$ is a selection of $\Theta$ if $T : P_1(\Theta) \to X$ is
$m$-measurable and $\mathrm{graph}(T) \subset \Theta$.
Proposition 3.2. Assume $\mu_0 \ll m$. Consider the sets

\[ E := \{x \in P_1(\Gamma) : \Gamma(x) \text{ is not a singleton}\}, \qquad \Gamma_E := \Gamma \cap (E \times X). \]

Then for any selection $T$ of $\Gamma_E$ and every $\pi \in \Pi(\mu_0, \mu_1)$ with $\pi(\Gamma) = 1$ it
holds

\[ \pi(\Gamma_E \setminus \mathrm{graph}(T)) = 0. \]

Proof. Step 1. Suppose by contradiction the existence of $\pi \in \Pi(\mu_0, \mu_1)$ with
$\pi(\Gamma) = 1$ and of a selection $T$ of $\Gamma_E$ such that

\[ \pi(\Gamma_E \setminus \mathrm{graph}(T)) = \alpha > 0. \]

By inner regularity with respect to compact sets, to prove the complete statement it is enough to
prove it under the additional assumptions that $E$ is compact and $T$ is continuous.
Note that

\[ \Gamma_E \setminus \mathrm{graph}(T) = \bigcup_{n=1}^{\infty} \{(x, y) \in \Gamma_E : d(y, T(x)) \ge 1/n\}. \]

Hence, there exists $n \in \mathbb{N}$ such that

\[ \pi\big(\{(x, y) \in \Gamma_E : d(y, T(x)) \ge 1/n\}\big) \ge \alpha' > 0. \]

Step 2. From continuity of $T$, the set $\{(x, y) \in \Gamma_E : d(y, T(x)) \ge 1/n\}$ is compact,
hence there exist $N \in \mathbb{N}$ and points
$(x_1, y_1), \dots, (x_N, y_N) \in \{(x, y) \in \Gamma_E : d(y, T(x)) \ge 1/n\}$ such that

\[ \{(x, y) \in \Gamma_E : d(y, T(x)) \ge 1/n\} \subset \bigcup_{i=1}^{N} B_{1/5n}((x_i, y_i)), \]

where on the product space $X \times X$ we consider the usual distance $(d^2 + d^2)^{1/2}$. Hence
there exists a couple $(x_i, y_i)$ such that

\[ \pi\big(B_{1/5n}((x_i, y_i)) \cap \{(x, y) \in \Gamma_E : d(y, T(x)) \ge 1/n\}\big) \ge \frac{\alpha'}{N}. \]

Since the measure $m$ restricted to $K$, as defined at the beginning of Section 2, is doubling (see
Remark 5.3 in [13]), we can also assume $x_1, \dots, x_N$ to be points of Lebesgue $m$-density one in
$P_1(\{(x, y) \in \Gamma_E : d(y, T(x)) \ge 1/n\})$ (e.g. see Chapter 1 of [5]). Note indeed that the
latter has positive $m$-measure.

From the continuity of $T$ it follows the existence of $\delta > 0$ so that if $d(x, x_i) < \delta$
then $d(T(x), T(x_i)) < 1/5n$. Hence

\[ T(B_{\delta}(x_i)) \cap P_2\big(B_{1/5n}((x_i, y_i))\big) = \emptyset, \]

indeed for any $y \in P_2(B_{1/5n}((x_i, y_i)))$ and $x \in B_{\delta}(x_i)$

\begin{align*}
d(y, T(x)) &\ge d(y_i, T(x)) - d(y, y_i) \\
  &\ge d(y_i, T(x_i)) - d(T(x_i), T(x)) - d(y, y_i) \\
  &> \frac{1}{n} - \frac{1}{5n} - \frac{1}{5n} = \frac{3}{5n}.
\end{align*}
Step 3. So consider

\[ S_1 := \mathrm{graph}(T) \cap (\bar{B}_{\delta}(x_i) \times X), \qquad
   S_2 := \bar{B}_{1/5n}((x_i, y_i)) \cap \{(x, y) \in \Gamma_E : d(y, T(x)) \ge 1/n\}, \]

where by $\bar{B}$ we mean the closed ball. By construction $S_1, S_2 \subset \Gamma$ and from Step 2

\[ P_2(S_1) \cap P_2(S_2) = \emptyset. \]

Then define $A := P_1(S_1) \cap P_1(S_2)$; since $x_i$ is a point of Lebesgue $m$-density one for
$P_1(\{(x, y) \in \Gamma_E : d(y, T(x)) \ge 1/n\})$, it follows that $m(A) > 0$. Hence the sets

\[ \Lambda_1 := S_1 \cap (A \times X), \qquad \Lambda_2 := S_2 \cap (A \times X) \]

are such that $P_1(\Lambda_1) = P_1(\Lambda_2)$ and $P_2(\Lambda_1) \cap P_2(\Lambda_2) = \emptyset$,
while $m(P_1(\Lambda_1)) > 0$. Since this contradicts Lemma 3.1, the claim is proved. □

The following theorem is now a straightforward consequence of what we have proved so far.

Theorem 3.3. For any $\pi \in \Pi(\mu_0, \mu_1)$ such that $\pi(\Gamma) = 1$ there exists an
$m$-measurable map $T : X \to X$ such that

\[ \pi(\mathrm{graph}(T)) = 1. \]

Proof. Let $\pi \in \Pi(\mu_0, \mu_1)$ be any transference plan so that $\pi(\Gamma) = 1$. As for
Proposition 3.2 consider the sets

\[ E := \{x \in P_1(\Gamma) : \Gamma(x) \text{ is not a singleton}\}, \qquad \Gamma_E := \Gamma \cap (E \times X). \]

Since

\[ \Gamma_E = P_{12}\big(\{(x, y, z, w) \in \Gamma \times \Gamma : d(x, z) = 0, \ d(y, w) > 0\}\big), \]

the set $\Gamma_E$ is an analytic set. For the definition of analytic set, see Chapter 4 of [12]. We
can then use the Von Neumann Selection Theorem for analytic sets, see Theorem 5.5.2 of [15], to obtain
a map $T : E \to X$, $\mathcal{A}$-measurable, where $\mathcal{A}$ is the $\sigma$-algebra generated
by analytic sets, so that $(x, T(x)) \in \Gamma_E$.
Then Proposition 3.2 implies that

\[ \pi \llcorner \Gamma_E = (\mathrm{Id}, T)_\sharp \, \mu_0 \llcorner E. \]

Since on $\Gamma \setminus \Gamma_E$ the plan $\pi$ is already supported on a graph, the claim follows. □

This directly implies 
Corollary 3.4. There is a unique optimal transport map. 

Proof. The last theorem shows that every optimal coupling is induced by a transport map. As the set of 
all optimal couplings is convex this directly implies the uniqueness. □ 
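The convexity argument behind the uniqueness can be sketched as follows. The maps $T_1, T_2$ and the plans $\pi_1, \pi_2$ are names introduced here for illustration; the sketch only uses that the cost is linear in the coupling and that every optimal coupling is induced by a map.

```latex
% Suppose T_1 and T_2 are both optimal maps (hypothetical names), and set
\[
  \pi_j := (\mathrm{id}, T_j)_\sharp \mu_0, \quad j = 1, 2,
  \qquad
  \pi := \tfrac{1}{2} (\pi_1 + \pi_2).
\]
% The cost is linear in the coupling, so \pi is again an optimal coupling.
% Its disintegration with respect to \mu_0 is
\[
  \pi_x = \tfrac{1}{2} \big( \delta_{T_1(x)} + \delta_{T_2(x)} \big)
  \qquad \text{for } \mu_0\text{-a.e. } x.
\]
% Since \pi is optimal, it is induced by a map T, so \pi_x = \delta_{T(x)} for \mu_0-a.e. x.
% A convex combination of two Dirac masses is a Dirac mass only if the masses coincide, hence
\[
  T_1(x) = T_2(x) \qquad \text{for } \mu_0\text{-a.e. } x.
\]
```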